object part
Embodiment-Agnostic Action Planning via Object-Part Scene Flow
Tang, Weiliang, Pan, Jia-Hui, Zhan, Wei, Zhou, Jianshu, Yao, Huaxiu, Liu, Yun-Hui, Tomizuka, Masayoshi, Ding, Mingyu, Fu, Chi-Wing
Observing that the key to robotic action planning is understanding how the target object moves when its associated part is manipulated by the end effector, we propose to generate the 3D object-part scene flow and extract its transformations to solve the action trajectories for diverse embodiments. The advantage of our approach is that it derives the robot action explicitly from object motion prediction, yielding a more robust policy by understanding the object motions. Also, unlike policies trained on embodiment-centric data, our method is embodiment-agnostic, generalizes across diverse embodiments, and can learn from human demonstrations. Our method comprises three components: an object-part predictor to locate the part for the end effector to manipulate, an RGBD video generator to predict future RGBD videos, and a trajectory planner to extract embodiment-agnostic transformation sequences and solve the trajectory for diverse embodiments. Even when trained on videos without trajectory data, our method outperforms existing works by 27.7% and 26.2% on the prevailing virtual environments MetaWorld and Franka-Kitchen, respectively. Furthermore, we conducted real-world experiments, showing that our policy, trained only with human demonstrations, can be deployed to various embodiments.
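The idea of extracting a transformation from object-part scene flow can be illustrated with a minimal sketch. It assumes noise-free flow vectors at three non-collinear points on the manipulated part and recovers the rigid motion by aligning two orthonormal frames (a TRIAD-style construction). The function names (`frame`, `rigid_from_flow`, `apply`) are hypothetical and this is not the paper's planner, which is not specified in the abstract.

```python
import math

def sub(a, b):
    return [a[i] - b[i] for i in range(3)]

def add(a, b):
    return [a[i] + b[i] for i in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def frame(p0, p1, p2):
    """Right-handed orthonormal axes built from three non-collinear points."""
    e1 = normalize(sub(p1, p0))
    e3 = normalize(cross(e1, sub(p2, p0)))
    e2 = cross(e3, e1)
    return [e1, e2, e3]

def rigid_from_flow(points, flow):
    """Return (R, t) with R p + t = p + flow(p) for three part points.

    Assumption: the flow is an exact rigid motion (no noise, no outliers).
    """
    moved = [add(p, f) for p, f in zip(points, flow)]
    before = frame(*points)
    after = frame(*moved)
    # R maps each 'before' axis onto the matching 'after' axis.
    R = [[sum(after[k][i] * before[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = sub(moved[0], matvec(R, points[0]))
    return R, t

def apply(R, t, p):
    """Move a point (e.g. a grasp point on the part) with the recovered motion."""
    return add(matvec(R, p), t)

# Example: the part rotates 90 degrees about z and shifts by (1, 2, 3).
part_points = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
part_flow = [[1.0, 2.0, 3.0], [0.0, 3.0, 3.0], [0.0, 1.0, 3.0]]
R, t = rigid_from_flow(part_points, part_flow)
grasp_after = apply(R, t, [1.0, 1.0, 0.0])
```

Because the recovered motion describes the object part rather than any particular robot, the same `(R, t)` could in principle be handed to an inverse-kinematics solver for whichever end effector is attached. With noisy flow over many points, a least-squares fit would be the more usual choice.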
V-MAO: Generative Modeling for Multi-Arm Manipulation of Articulated Objects
Manipulating articulated objects requires multiple robot arms in general. It is challenging to enable multiple robot arms to collaboratively complete manipulation tasks on articulated objects. In this paper, we present V-MAO, a framework for learning multi-arm manipulation of articulated objects. Our framework includes a variational generative model that learns contact point distribution over object rigid parts for each robot arm. The training signal is obtained from interaction with the simulation environment which is enabled by planning and a novel formulation of object-centric control for articulated objects. We deploy our framework in a customized MuJoCo simulation environment and demonstrate that our framework achieves a high success rate on six different objects and two different robots. We also show that generative modeling can effectively learn the contact point distribution on articulated objects.
Unsupervised Learning of Neural Networks to Explain Neural Networks (extended abstract)
Zhang, Quanshi, Yang, Yu, Wu, Ying Nian
This paper presents an unsupervised method to learn a neural network, namely an explainer, to interpret a pre-trained convolutional neural network (CNN), i.e., the explainer uses interpretable visual concepts to explain features in middle conv-layers of a CNN. Given feature maps of a conv-layer of the CNN, the explainer performs like an auto-encoder, which decomposes the feature maps into object-part features. The object-part features are learned to reconstruct CNN features without much loss of information. We can consider the disentangled representations of object parts a paraphrase of CNN features, which help people understand the knowledge encoded by the CNN. More crucially, we learn the explainer via knowledge distillation without using any annotations of object parts or textures for supervision. In experiments, our method was widely used to interpret features of different benchmark CNNs, and explainers significantly boosted the feature interpretability without hurting the discrimination power of the CNNs.
Robobarista: Object Part based Transfer of Manipulation Trajectories from Crowd-sourcing in 3D Pointclouds
Sung, Jaeyong, Jin, Seok Hyun, Saxena, Ashutosh
There is a large variety of objects and appliances in human environments, such as stoves, coffee dispensers, juice extractors, and so on. It is challenging for a roboticist to program a robot for each of these object types and for each of their instantiations. In this work, we present a novel approach to manipulation planning based on the idea that many household objects share similarly-operated object parts. We formulate the manipulation planning as a structured prediction problem and design a deep learning model that can handle large noise in the manipulation demonstrations and learns features from three different modalities: point-clouds, language and trajectory. In order to collect a large number of manipulation demonstrations for different objects, we developed a new crowd-sourcing platform called Robobarista. We test our model on our dataset consisting of 116 objects with 249 parts along with 250 language instructions, for which there are 1225 crowd-sourced manipulation demonstrations. We further show that our robot can even manipulate objects it has never seen before.
Optimization in Differentiable Manifolds in Order to Determine the Method of Construction of Prehistoric Wall-Paintings
Arabadjis, Dimitris, Rousopoulos, Panayiotis, Papaodysseus, Constantin, Exarhos, Michalis, Panagopoulos, Michalis, Papazoglou-Manioudaki, Lena
In this paper a general methodology is introduced for the determination of potential prototype curves used for the drawing of prehistoric wall-paintings. The approach includes a) preprocessing of the wall-painting contours to properly partition them, according to their curvature, b) choice of prototype curve families, c) analysis and optimization in a 4-manifold for a first estimation of the form of these prototypes, d) clustering of the contour parts and the prototypes, to determine a minimal number of potential guides, e) further optimization in a 4-manifold, applied to each cluster separately, in order to determine the exact functional form of the potential guides, together with the corresponding drawn contour parts. The introduced methodology simultaneously deals with two problems: a) the arbitrariness in data-point orientation and b) the determination of one proper form for a prototype curve that optimally fits the corresponding contour data. Arbitrariness in orientation has been dealt with by a novel curvature-based error, while the proper forms of curve prototypes have been exhaustively determined by embedding curvature deformations of the prototypes into 4-manifolds. Application of this methodology to celebrated wall-paintings excavated at Tiryns, Greece and the Greek island of Thera indicates that it is highly probable that these wall-paintings had been drawn by means of geometric guides that correspond to linear spirals and hyperbolae. These geometric forms fit the drawings' lines with an exceptionally low average error, less than 0.39 mm. Hence, the approach suggests the existence of accurate realizations of complicated geometric entities, more than 1000 years before their axiomatic formulation in Classical Ages.
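Why a curvature-based error sidesteps the orientation problem can be shown with a small sketch. It uses the discrete (Menger) curvature of three consecutive contour samples and a mean absolute curvature difference as the error. Both are illustrative assumptions, since the abstract does not give the paper's actual error definition, and the names `menger_curvature`, `contour_curvatures` and `curvature_error` are hypothetical.

```python
import math

def menger_curvature(a, b, c):
    """Curvature of the circle through three 2D points (0.0 if collinear)."""
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    d = math.dist(a, b) * math.dist(b, c) * math.dist(c, a)
    return 0.0 if d == 0 else 2.0 * abs(cross) / d

def contour_curvatures(points):
    """Discrete curvature at every interior sample of a contour."""
    return [menger_curvature(points[i - 1], points[i], points[i + 1])
            for i in range(1, len(points) - 1)]

def rotate(points, angle):
    """Rotate a contour about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def curvature_error(contour, prototype):
    """Mean absolute curvature difference of two equally sampled curves."""
    ka = contour_curvatures(contour)
    kb = contour_curvatures(prototype)
    return sum(abs(x - y) for x, y in zip(ka, kb)) / len(ka)

# An arc of radius 2 has curvature 1/2 at every sample.
arc = [(2.0 * math.cos(0.1 * i), 2.0 * math.sin(0.1 * i)) for i in range(20)]
# The same arc drawn at an arbitrary orientation.
turned = rotate(arc, 1.234)

arc_curvatures = contour_curvatures(arc)
orientation_error = curvature_error(arc, turned)
```

Curvature depends only on distances and areas between neighbouring samples, so `orientation_error` stays at zero up to rounding no matter how the contour part is rotated. A point-to-point distance error would instead require the orientation to be estimated first.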